Corpus-Driven Contextualized Categorization

Authors

  • Tony Veale
  • Yanfen Hao
Abstract

Ontologies strive to offer an interconnected, hierarchical system of categories to guide our actions in a complex world. But the boundaries of these categories are highly context-dependent, and what constitutes a prototypical category member in one context may be atypical or unrepresentative in another. In this paper we outline a dynamic, trainable, bottom-up view of category structure based on context-sensitive corpus analysis. By learning from corpora about how people actually use categories in different contexts, we can train our ontologies to creatively adapt themselves to these

Similar resources

Concordance-Based Data-Driven Learning Activities and Learning English Phrasal Verbs in EFL Classrooms

In spite of the highly beneficial applications of corpus linguistics in language pedagogy, it has not found its way into mainstream EFL. The major reasons seem to be the teachers’ lack of training and the unavailability of resources, especially computers in language classes. Phrasal verbs have been shown to be a problematic area of learning English as a foreign language due to their semantic op...

A Corpus-Driven Study of the Variation of Co-Occurrence Patterns in Written and Spoken Registers

This paper will focus on the study of the variation of co-occurrence patterns encountered in written and spoken registers, through the analysis of a large lexical database of corpus-extracted multiword expressions (MWEs) of European Portuguese. Those MWEs were automatically extracted from a balanced 50 million word written corpus and a 1 million word spoken corpus, furthermore statistically int...

Sense Contextualization in a Dependency-Based Compositional Distributional Model

Little attention has been paid to distributional compositional methods which employ syntactically structured vector models. As word vectors belonging to different syntactic categories have incompatible syntactic distributions, no trivial compositional operation can be applied to combine them into a new compositional vector. In this article, we generalize the method described by Erk and Padó (20...

A Wikipedia-based Corpus for Contextualized Machine Translation

We describe a corpus for and experiments in target-contextualized machine translation (MT), in which we incorporate language models from target-language documents that are comparable in nature to the source documents. This corpus comprises (i) a set of curated English Wikipedia articles describing news events along with (ii) their comparable Spanish counterparts, (iii) a number of the Spanish s...

Transcultural categorization in contextualized domains

Introduction. This study takes classifications of musical instruments from three different cultural regions to show that the model of knowledge organization in use is not appropriate for cultural integration. Method. The set of categories used for the analysed instruments has been taken from previous work of M. Kartomi and M. López-Huertas. Analysis. The selected categories have been processe...

Publication date: 2006